342 PART 6 Analyzing Survival Data

Assessing goodness-of-fit and predictive ability of the model

There are several measures of how well a regression model fits the survival data. These measures can be useful when you’re choosing among several different models:

» Should you include a possible predictor variable (like age) in the model?

» Should you include the squares or cubes of predictor variables in the model (meaning including age² or age³ in addition to age)?

» Should you include a term for the interaction between two predictors?

Your software may offer one or more of the following goodness-of-fit measures:

» A measure of agreement between the observed and predicted outcomes called concordance (see the bottom of Figure 23-4). Concordance indicates the extent to which participants with higher predicted hazard values had shorter observed survival times, which is what you’d expect. A concordance of 0.5 means the model does no better than chance, and 1.0 means perfect agreement. Figure 23-4 shows a concordance of 0.642 for this regression.

» An r (or r²) value that’s interpreted like a correlation coefficient in ordinary regression, meaning the larger the r² value, the better the model fits the data. In Figure 23-4, r² (labeled Rsquare) is 0.116.

» A likelihood ratio test and associated p value that compares the full model, which includes all the parameters, to a model consisting of just the overall baseline function. In Figure 23-4, the likelihood ratio p value is shown as 4.46e-06, which is scientific notation for p = 0.00000446, indicating that a model that includes the CenterCD and Radiation variables can predict survival statistically significantly better than just the overall (baseline) survival curve.

» Akaike’s Information Criterion (AIC), which is especially useful for comparing alternative models (the model with the lower AIC is preferred) but is not included in Figure 23-4.
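If you want to see what these measures are doing under the hood, the following is a minimal sketch in Python, not the method your software necessarily uses. It assumes a Harrell-style concordance with simplified handling of ties (packages differ on this), a likelihood ratio test based on twice the difference in log-likelihoods, and the usual AIC formula of 2k minus twice the log-likelihood. All function names, the toy data, and the log-likelihood values are made up for illustration and are not taken from Figure 23-4.

```python
import math


def concordance(times, events, risk_scores):
    """Harrell-style concordance for right-censored data (simplified ties)."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is usable only when participant i had an observed event
            # strictly before participant j's event or censoring time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0  # shorter survival, higher predicted hazard
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    return concordant / comparable


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    y = max(x, 0.0) / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)              # closed form for df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))   # closed form for df = 1
    # Step up two degrees of freedom at a time with the incomplete-gamma recurrence.
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


def likelihood_ratio_test(loglik_null, loglik_full, df):
    """Return the test statistic and p value; df = number of added parameters."""
    stat = 2.0 * (loglik_full - loglik_null)
    return stat, chi2_sf(stat, df)


def aic(loglik, n_params):
    """Akaike's Information Criterion: lower values indicate a preferable model."""
    return 2.0 * n_params - 2.0 * loglik


# Hypothetical illustration data (1 = event observed, 0 = censored).
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk_scores = [0.9, 0.7, 0.8, 0.1]
c_index = concordance(times, events, risk_scores)

# Hypothetical log-likelihoods for a baseline-only model and a two-parameter model.
lr_stat, lr_p = likelihood_ratio_test(loglik_null=-100.0, loglik_full=-97.0, df=2)
print(c_index, lr_stat, lr_p, aic(-97.0, 2))
```

To compare two candidate models (say, with and without age), you would compute the AIC for each from its own log-likelihood and parameter count, and lean toward the model with the smaller value.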

Focusing on baseline survival and hazard functions

The baseline survival function is represented as a table with two columns (time and predicted survival) and a row for each distinct time at which one or more events were observed.
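To make the shape of that table concrete, here is a minimal sketch of how such a table could be built. It assumes the Breslow estimator of the baseline cumulative hazard, which is one common choice; the text doesn't say which estimator your software uses, and some packages also center the predictors first, which changes the values. The function name, the toy data, and the linear predictor values are hypothetical.

```python
import math


def baseline_survival(times, events, linear_predictors):
    """Return [(time, predicted baseline survival)], one row per distinct event time.

    Assumes the Breslow estimator: at each event time, the baseline hazard
    increment is (number of events) / (sum of exp(linear predictor) over
    everyone still at risk).
    """
    event_times = sorted({t for t, e in zip(times, events) if e})
    table = []
    cumulative_hazard = 0.0
    for t in event_times:
        # Events observed exactly at time t (censored observations don't count).
        n_events = sum(1 for tj, ej in zip(times, events) if ej and tj == t)
        # Everyone whose follow-up time is at least t is still at risk.
        risk_total = sum(
            math.exp(lp) for tj, lp in zip(times, linear_predictors) if tj >= t
        )
        cumulative_hazard += n_events / risk_total
        table.append((t, math.exp(-cumulative_hazard)))
    return table


# Hypothetical data: 1 = event observed, 0 = censored; all linear predictors are 0.
times = [2, 4, 4, 6, 8]
events = [1, 1, 0, 1, 0]
linear_predictors = [0.0, 0.0, 0.0, 0.0, 0.0]

table = baseline_survival(times, events, linear_predictors)
for time, survival in table:
    print(time, round(survival, 4))
```

Notice that the censored observations at times 4 and 8 don't get their own rows; they only affect how many participants are counted as at risk at each event time.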